
Configurable hashes per tick #29659

Merged · 1 commit into solana-labs:master on Jan 12, 2023

Conversation

@bw-solana (Contributor)

Problem

There is currently no mechanism for changing the number of hashes per tick.

Summary of Changes

  1. Add new feature to control updating hashes per tick on epoch boundary (to DEFAULT_HASHES_PER_SECOND)
  2. When synchronizing PoH with a bank, update hashes per tick from the bank value
  3. Reset PoH from set_bank when a new hashes_per_tick value is detected
  4. Remove poh_config from PohRecorder

Note: item 4 above is the key change that gets things working. With a naive implementation, we see PoH continue building up tick entries with the old hashes_per_tick, and once the bank with the updated hashes_per_tick gets assigned, it sends these old entries associated with the new bank and Blockstore barfs. We need to kill these old tick entries and restart with the updated hashes_per_tick.
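
For illustration, a minimal Rust sketch of the epoch-boundary update (items 1 and 4). The types, field names, and constant below are simplified stand-ins for the runtime's actual structures, not the solana-runtime API:

```rust
/// Stand-in for the feature gate state (hypothetical field name).
struct FeatureSet {
    update_hashes_per_tick_active: bool,
}

/// Minimal stand-in for the Bank fields that matter here.
struct Bank {
    epoch: u64,
    parent_epoch: u64,
    hashes_per_tick: Option<u64>,
    feature_set: FeatureSet,
}

/// Target per-tick rate on activation; 12_500 matches the value observed in
/// the testing notes below (the real code derives it from DEFAULT_HASHES_PER_SECOND).
const UPDATED_HASHES_PER_TICK: u64 = 12_500;

impl Bank {
    /// Only act when the feature is active and we are crossing an epoch
    /// boundary; otherwise child banks keep inheriting hashes_per_tick
    /// from their parent (or from the snapshot they loaded).
    fn apply_updated_hashes_per_tick(&mut self) {
        let crossing_epoch_boundary = self.epoch != self.parent_epoch;
        if self.feature_set.update_hashes_per_tick_active && crossing_epoch_boundary {
            self.hashes_per_tick = Some(UPDATED_HASHES_PER_TICK);
        }
    }
}
```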

Theory of Operation

PoH will derive the appropriate hashes per tick from the associated bank for that slot (through the reset function, which synchronizes bank and PoH). Banks will inherit hashes per tick from their parent (or load it from a snapshot), except in the special case where we activate the 'configurable hashes per tick' feature and force the child bank to DEFAULT_HASHES_PER_SECOND on the epoch boundary. Thus, changing DEFAULT_HASHES_PER_SECOND should have no impact on what actually happens on chain until the feature is activated. It will, of course, impact testing.
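
A hedged sketch of the bank/PoH synchronization path: set_bank and reset are real PohRecorder methods, but the structs and simplified signatures below are illustrative stand-ins rather than the actual crate API:

```rust
/// Simplified stand-in for the PoH state machine.
struct Poh {
    hashes_per_tick: u64,
    tick_height: u64,
}

/// Simplified stand-in for a Bank; only the fields used here.
struct Bank {
    hashes_per_tick: u64,
    tick_height: u64,
}

/// Simplified stand-in for PohRecorder (after this change it no longer
/// carries a PohConfig; it takes its rate from the bank).
struct PohRecorder {
    poh: Poh,
}

impl PohRecorder {
    /// On set_bank, detect a hashes_per_tick mismatch and reset PoH so no
    /// tick entries built at the stale rate get attached to the new bank.
    fn set_bank(&mut self, bank: &Bank) {
        if self.poh.hashes_per_tick != bank.hashes_per_tick {
            self.reset(bank);
        }
        // ...attach the working bank and continue recording (elided)
    }

    /// Restart PoH aligned with the bank: old in-flight tick entries are
    /// dropped rather than forwarded to replay / Blockstore.
    fn reset(&mut self, bank: &Bank) {
        self.poh = Poh {
            hashes_per_tick: bank.hashes_per_tick,
            tick_height: bank.tick_height,
        };
    }
}
```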

Testing

Testing these changes is difficult as we need to coerce a hashes per tick change to agitate things. The following testing has been performed on top of the normal buildkite stuff:

  • Set ClusterConfig's poh_config.hashes_per_tick to 500 (so that we will actually hash to tick), forced apply_updated_hashes_per_tick to execute on every epoch boundary, and ran the local_cluster, poh, runtime, and core cargo tests. Note: logging confirmed that we reset PoH from 500 to 12500 hashes per tick on the epoch boundary and that all entries sent to replay/other validators were aligned with bank.hashes_per_tick
  • Added a new unit test, test_feature_activation_idempotent, which ensures repeated calls to apply_feature_activations will not cause issues (see the sketch after this list)
  • Ran a validator on MNB with these changes (feature not activated, DEFAULT_HASHES_PER_SECOND unchanged)
  • Ran a validator on MNB with these changes (feature not activated, DEFAULT_HASHES_PER_SECOND increased)
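
A rough sketch of what the idempotency test checks, reusing the simplified Bank, FeatureSet, and constant from the sketch under the Summary of Changes (the real test exercises the actual Bank and apply_feature_activations):

```rust
#[test]
fn test_feature_activation_idempotent() {
    // A bank crossing an epoch boundary with the feature active.
    let mut bank = Bank {
        epoch: 1,
        parent_epoch: 0,
        hashes_per_tick: Some(500),
        feature_set: FeatureSet { update_hashes_per_tick_active: true },
    };

    // The first application moves the bank to the updated rate...
    bank.apply_updated_hashes_per_tick();
    assert_eq!(bank.hashes_per_tick, Some(UPDATED_HASHES_PER_TICK));

    // ...and repeating the call is a no-op rather than an error.
    bank.apply_updated_hashes_per_tick();
    assert_eq!(bank.hashes_per_tick, Some(UPDATED_HASHES_PER_TICK));
}
```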

@bw-solana bw-solana marked this pull request as ready for review January 11, 2023 20:55
@t-nelson (Contributor) left a comment

lgtm, thanks!

might pay to spool up some test clusters of various sizes to activate a hashes_per_tick change on, and see how they do. can be done post merge since we won't likely backport this to 1.14

@bw-solana bw-solana linked an issue Jan 11, 2023 that may be closed by this pull request
@bw-solana bw-solana merged commit 69c4db2 into solana-labs:master Jan 12, 2023
@jstarry (Member) commented Apr 12, 2024

I don't think the current implementation is ideal when setting a working bank which has skipped previous slots. In this case, the poh recorder has already computed poh ticks for the skipped slots, and all of that time is wasted when poh is reset in PohRecorder::set_bank and those ticks are thrown away, forcing the leader to recompute a few slots' worth of ticks.

I think a better approach would be to have PohRecorder::reset somehow look ahead to what the hashes_per_tick value should be for next_leader_slot and use that value whenever resetting poh. This could be done by delaying hashes per tick changes by an epoch and tracking a next_epoch_hashes_per_tick for an epoch before it's applied.

I think this is a relatively minor issue since we typically confirm a block shortly after the epoch boundary, and therefore skipped slots should be minimal.
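
A minimal sketch of this delayed-activation idea; the schedule struct and next_epoch_hashes_per_tick bookkeeping below are hypothetical, not existing PohRecorder state:

```rust
/// Hypothetical bookkeeping for delaying a hashes_per_tick change by an
/// epoch, so a reset can pick the right rate for a future leader slot.
struct HashesPerTickSchedule {
    current_hashes_per_tick: u64,
    /// Recorded one epoch ahead of when it takes effect.
    next_epoch_hashes_per_tick: u64,
    next_epoch_start_slot: u64,
}

impl HashesPerTickSchedule {
    /// Choose the rate to reset PoH with: if the next leader slot lands in
    /// the upcoming epoch, tick at the new rate immediately so the ticks
    /// survive set_bank instead of being thrown away and recomputed.
    fn rate_for_reset(&self, next_leader_slot: u64) -> u64 {
        if next_leader_slot >= self.next_epoch_start_slot {
            self.next_epoch_hashes_per_tick
        } else {
            self.current_hashes_per_tick
        }
    }
}
```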

@bw-solana (Contributor, Author)

Chatted offline w/ @jstarry. Agree we are throwing out the ticks when calling set_bank, and we could avoid this if we know our next leader slot falls in the epoch where ticks are increased.

Note: we can start ticking at the updated rate even in the N-1 epoch under the assumption that all of the slots up until our leader slot will get skipped. If they don't, we'll just keep resetting the poh and retrying. If they do get skipped, we'll have the ticks at the right hash rate to prove passage of time (and thus won't have to dump them in set_bank).

Linked issue (may be closed by this pull request): Feature Gate: Update Hashes per Tick